Equivalence of Optimality Criteria for Markov Decision Process and Model Predictive Control

Abstract

This paper shows that the optimal policy and value functions of a Markov Decision Process (MDP), either discounted or not, can be captured by a finite-horizon undiscounted Optimal Control Problem (OCP), even if the OCP is based on an inexact model. This can be achieved by selecting a proper stage cost and terminal cost for the OCP. A very useful particular case of OCP is a Model Predictive Control (MPC) scheme where a deterministic (possibly nonlinear) model is used to reduce the computational complexity. This observation leads us to parameterize an MPC scheme fully, including the cost function. In practice, Reinforcement Learning algorithms can then be used to tune the parameterized MPC scheme. We verify the developed theorems analytically in an LQR case and we investigate some other nonlinear examples in simulations.
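The central claim of the abstract can be illustrated on a small tabular problem. The sketch below is not taken from the paper: it assumes a finite, discounted MDP with random costs and transitions, a deliberately arbitrary deterministic model `f_hat`, and one common way of choosing the costs, namely terminal cost T(s) = V*(s) and stage cost L(s, a) = Q*(s, a) - V*(f_hat(s, a)). All names (`solve_discounted_mdp`, `solve_undiscounted_ocp`, `f_hat`) are hypothetical. Under these assumptions, the undiscounted finite-horizon OCP on the wrong model should return the value function and policy of the true discounted MDP.

```python
import random


def solve_discounted_mdp(P, cost, gamma, iters=2000):
    """Value iteration on a tabular MDP; returns approximations of (Q*, V*)."""
    n_s, n_a = len(cost), len(cost[0])
    V = [0.0] * n_s
    for _ in range(iters):
        Q = [[cost[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        V = [min(row) for row in Q]
    return Q, V


def solve_undiscounted_ocp(f_hat, stage, terminal, horizon):
    """Backward dynamic programming for a deterministic, undiscounted OCP.

    Requires horizon >= 1. Returns the value function and the greedy
    first-stage policy of the horizon-length problem.
    """
    n_s, n_a = len(stage), len(stage[0])
    V = list(terminal)
    for _ in range(horizon):
        Qk = [[stage[s][a] + V[f_hat[s][a]] for a in range(n_a)]
              for s in range(n_s)]
        V = [min(row) for row in Qk]
    policy = [row.index(min(row)) for row in Qk]
    return V, policy


rng = random.Random(0)
n_s, n_a, gamma = 5, 3, 0.9

# Assumed "true" stochastic MDP: random stage costs and transition kernels.
cost = [[rng.uniform(0.0, 1.0) for _ in range(n_a)] for _ in range(n_s)]
P = []
for s in range(n_s):
    P.append([])
    for a in range(n_a):
        w = [rng.uniform(0.1, 1.0) for _ in range(n_s)]
        total = sum(w)
        P[s].append([x / total for x in w])

Q_star, V_star = solve_discounted_mdp(P, cost, gamma)
pi_star = [row.index(min(row)) for row in Q_star]

# Inexact deterministic model: next states chosen at random (an assumption
# made to stress that the model need not match the true dynamics).
f_hat = [[rng.randrange(n_s) for _ in range(n_a)] for _ in range(n_s)]

# Modified costs for the OCP (assumed construction, see lead-in).
stage = [[Q_star[s][a] - V_star[f_hat[s][a]] for a in range(n_a)]
         for s in range(n_s)]
terminal = V_star

V_ocp, pi_ocp = solve_undiscounted_ocp(f_hat, stage, terminal, horizon=4)
```

With these costs the one-step problem reduces to minimizing Q*(s, a) over a, so the match holds for any horizon by induction. The catch is that the stage cost depends on Q* and V*, which are unknown in practice, and this is why the abstract proposes tuning a parameterized cost with Reinforcement Learning.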

Similar resources

Improved Optimization Process for Nonlinear Model Predictive Control of PMSM

Model-based predictive control (MPC) is one of the most efficient techniques and is widely used in industrial applications. In such controllers, increasing the prediction horizon results in a better selection of the optimal control signal sequence. On the other hand, increasing the prediction horizon increases the computational time of the optimization process, which makes it impossible to be imple...

Learning-based model predictive control for Markov decision processes

We propose the use of Model Predictive Control (MPC) for controlling systems described by Markov decision processes. First, we consider a straightforward MPC algorithm for Markov decision processes. Then, we propose value functions, a means to deal with issues arising in conventional MPC, e.g., computational requirements and sub-optimality of actions. We use reinforcement learning to let an MPC...

Continuous-time Markov decision processes with nth-bias optimality criteria

In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias...

Constrained model predictive control: Stability and optimality

Model predictive control is a form of control in which the current control action is obtained by solving, at each sampling instant, a finite horizon open-loop optimal control problem, using the current state of the plant as the initial state; the optimization yields an optimal control sequence and the first control in this sequence is applied to the plant. An important advantage of this type of c...

On optimality of nonlinear model predictive control

In this note, the Infinite Horizon (IH) optimality property of Nonlinear Model Predictive Control (MPC) is analysed. In particular, a counterexample shows that the conjecture that the IH cost of the closed-loop system controlled with a stabilizing MPC controller is a monotonically decreasing function of the optimization horizon is fallacious.

Journal

Journal title: IEEE Transactions on Automatic Control

Year: 2023

ISSN: 0018-9286, 1558-2523, 2334-3303

DOI: https://doi.org/10.1109/tac.2023.3277309